Where the whale-things are?

David L Miller (@millerdl)

University of Melbourne

15 April 2016


Slides available at http://converged.yt

Who is this guy?

  • Currently:
    • NOAA Northeast Fisheries Science Center (sic), Woods Hole, MA
    • (“Honourary” (ha!) position at St Andrews)
  • Previously:
    • Centre for Research into Ecological and Environmental Modelling, St Andrews
    • University of Rhode Island
    • PhD Bath, UK; spatial modelling w/ Simon Wood
  • Why am I here?
    • Collab w/ Mark Bravington CSIRO, Nat Kelly Antarctic Division
not a real statistician,
not a real ecologist,
knows nothing about whales.

with that in mind...

(a talk)

Abundance and distribution: why?

  • Want to know where they are
    • whale watching/stock assessments
  • Want to know where they aren't
    • Navy/seismic/oil/shipping/fishing etc
  • Want to know what they like/dislike
    • ecology/biology

How? Big white boats

How? Big white boats

Can't we just count them all?

Detectability

Distribution

Examples in this talk

Sperm whales off the US east coast

Example data (thanks NOAA!)

Let's talk about detectability

Detectability

Distance sampling

  • “Fit to the histogram”
  • Model:

\[ \mathbb{P} \left[ \text{animal detected } \vert \text{ animal at distance } y\right] = g(y;\boldsymbol{\theta}) \]

  • Calculate the average probability of detection:

\[ \hat{p} = \frac{1}{w} \int_0^w g(y; \boldsymbol{\hat{\theta}}) \text{d}y \]

Distance sampling (extensions)

  • Covariates that affect detectability (Marques et al, 2007)
  • Perception bias (\( g(0)<1 \)) (Burt et al, 2014)
  • Availability bias (Winiarski et al, 2013; Borchers et al, 2013)
  • Detection function formulations (Miller and Thomas, 2015)
  • Measurement error (Marques, 2004)  

Figure from Marques et al (2007)

Horvitz-Thompson-like estimators

  • Rescale the (flat) density and extrapolate

\[ \hat{N} = \frac{\text{study area}}{\text{survey area}}\sum_{i=1}^n \frac{s_i}{\hat{p}_i} \]

(where \( s_i \) are group/cluster sizes)

That's not really how the ocean works

Model-based abundance estimation

Rather than hoping our design detects to the changes in density and contriving a complex set of strata, use a model!

  • a priori, no idea how “bad” your design is \( \Rightarrow \) use a model.
  • a posteriori, no idea how “bad” your design was \( \Rightarrow \) use a model.

Density surface models

Hedley and Buckland (2004)

Miller et al. (2013)

DSM process flow diagram

2 (or more)-stage models

  • Detectability \( \Rightarrow \) effective area surveyed (offset in model)
  • Multi-stage models are handy!
  • Understand (and check) each part
  • Split your modelling efforts amongst people

Generalised additive models (in 1 slide)

Taking the previous example…

\[ n_j = \color{red}{A_j}\color{blue}{\hat{p}_j} \color{green}{\exp}\left[\color{grey}{ \beta_0 + \sum_k f_k(z_{kj})} \right] + \epsilon_j \]

where \( \epsilon_j \sim N(0, \sigma^2) \), \( \quad n_j\sim \) count distribution

  • \( \color{red}{\text{area of segment}} \)
  • \( \color{blue}{\text{probability of detection in segment}} \)
  • \( \color{green}{\text{(inverse) link function}} \)
  • \( \color{grey}{\text{model terms}} \)

SPOILER ALERT: your model is probably just a very fancy GLM

Why GAMs are cool...

  • Smooth terms
  • Fancy smooths
  • Fancy responses
  • Random effects (by equivalence)
  • Correlation structures
  • See Wood (2006) for a handy intro

Count distributions

plot of chunk countshist

  • Response is a count (not not always integer)
  • Often, it's mostly zero (that's complicated)
  • Want response distribution that deals with that
  • Flexible mean-variance relationship

Tweedie distribution

plot of chunk tweedie

  • \( \text{Var}\left(\text{count}\right) = \phi\mathbb{E}(\text{count})^q \)
  • Common distributions are sub-cases:
    • \( q=1 \Rightarrow \) Poisson
    • \( q=2 \Rightarrow \) Gamma
    • \( q=3 \Rightarrow \) Normal
  • We are interested in \( 1 < q < 2 \)
  • (here \( q = 1.2, 1.3, \ldots, 1.9 \))

BREAK

Silly things I've worked on lately...

emoGG

plot of chunk emogg

  • Use emojis in (gg)plots
  • (Sounds silly but actually quite useful!)




github.com/dill

beyonce

END OF BREAK

Let's fit a model

library(dsm)
dsm_env_tw <- dsm(count~s(Depth) + s(NPP) + s(SST),
                  ddf.obj=df_hr,
                  segment.data=segs, observation.data=obs,
                  family=tw(), method="REML")

(method="REML" uses REML to select wigglyness; Wood, 2011)

dsm is based on mgcv by Simon Wood

plot of chunk seconddsm

So we're done? Nope.

  • Need to make smooths “wiggly enough”
  • Term selection
    • shrinkage (Marra & Wood, 2011)
    • \( p \) values (Marra & Wood, 2012)
  • Residuals check plots (Dunn and Smyth, 1996)

Tobler's first law of geography

“Everything is related to everything else, but near things are more related than distant things”

Tobler (1970)

Implications of Tobler's law

plot of chunk pairrrrs

What can we do about this?

  • Careful inclusion of terms
  • Fit models using robust criteria (REML)
  • Test for concurvity (mgcv::concurvity)
  • Test for sensitivity (lots of models)

What is the "right" model?




¯\_(ツ)_/¯

Let's talk about prediction

Predictions over arbitrary areas

plot of chunk predplot

  • Don't want to be restricted in where to predict
    • Predict within survey area
    • Extrapolate outside (with caution)
  • Working on a grid of cells

Okay, but we're actually doing statistics

Variance

  • Need to propagate uncertainty!
  • This is hard, but can be done!
  • Refit model with “extra” term

\[ n_j = A_j p_j \exp \left\{ \color{red}{\left[\frac{ \partial \log p_j(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}} \Big\vert_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}}\right] \boldsymbol{\gamma}} + \beta_0 + \sum_k f_k(z_{jk}) \right\} \]

  • random effect – fix the corresponding variance matrix \( \boldsymbol{\gamma} \sim N(0,-\mathbf{H}^{-1}_\theta) \)
  • Bravington, Hedley & Miller (in prep)

Plotting uncertainty

plot of chunk vars

  • Maps of coefficient of variation (probably biased?)
  • CV for given area (better)
  • This is hard

Communicating uncertainty

  • Are animations a good way to do this?
  • 100 simulated possible distributions
  • Some features (e.g. shelf, N-S gradient) stick out

Availability

Whales are never where you want them...

  • We can only see whales at the surface
  • What proportion of the time are they there?
    • Acoustics
    • DTAGs etc
    • Behavioural studies
  • Fixed correction to \( \hat{p} \)
  • Model via fancy Markov models (MMPP, HMM; Borchers et al, 2013)

Photo from University of St Andrews Library Special Collections

Conclusions

  • This methodology is general
    • Bears, birds, beer cans, Loch Ness monsters can all be modelled
  • Models are flexible!
    • Linear things, smooth things, random effect things (and more)
  • If you know GLMs, you can get started with DSMs
    • Mature theoretical basis, still lots to do
  • Active user community, active software development

Acknowledgements

  • St Andrews: Eric Rexstad, Len Thomas, Louise Burt, Laura Marshall, Steve Buckland, Sharon Hedley
  • Duke: Jason Roberts, Laura Mannocci
  • CSIRO: Mark Bravington
  • Antarctic Division: Natalie Kelly
  • Funding: University of St Andrews, State of Rhode Island, US Navy, NOAA, International Whaling Commission

Resources

Thanks!

Slides w/ references available at converged.yt

References

Borchers, D. L., Zucchini, W., Heide-Jørgensen, M. P., Cañadas, A., Langrock, R., Buckland, S. T., & Marques, T. A. (2013). Using hidden Markov models to deal with availability bias on line transect surveys. Biometrics, 69(3).
Burt, M. L., Borchers, D. L., Jenkins, K. J., & Marques, T. A. (2014). Using mark-recapture distance sampling methods on line transect surveys. Methods in Ecology and Evolution, 5(11).
Dunn, P. K., & Smyth, G. K. (1996). Randomized Quantile Residuals. Journal of Computational and Graphical Statistics, 5(3).
Hedley, S. L., & Buckland, S. T. (2004). Spatial models for line transect sampling. Journal of Agricultural, Biological, and Environmental Statistics, 9(2).
Marques, T. A. (2004). Predicting and correcting bias caused by measurement error in line transect sampling using multiplicative error models. Biometrics, 60(3).
Marques, T. A., Thomas, L., Fancy, S. G., & Buckland, S. T. (2007). Improving estimates of bird density using multiple-covariate distance sampling. The Auk, 124(4).
Marra, G., & Wood, S. N. (2011). Practical variable selection for generalized additive models. Computational Statistics and Data Analysis, 55(7).
Marra, G., & Wood, S. N. (2012). Coverage Properties of Confidence Intervals for Generalized Additive Model Components. Scandinavian Journal of Statistics, 39(1).
Miller, D. L., Burt, M. L., Rexstad, E. A., & Thomas, L. (2013). Spatial models for distance sampling data: recent developments and future directions. Methods in Ecology and Evolution, 4(11).
Miller, D. L., & Thomas, L. (2015). Mixture models for distance sampling detection functions. PLoS ONE.
Winiarski, K. J., Miller, D. L., Paton, P. W. C., & McWilliams, S. R. (2013). Spatially explicit model of wintering common loons: conservation implications. Marine Ecology Progress Series, 492.

Handy awkward question answers

Straight lines vs. interpolation

plot of chunk wiggles

  • Want a line that is “close” to all the data
  • Don't want interpolation – we know there is “error”
  • Balance between interpolation and “fit”

Smoothing parameter

plot of chunk wiggles-plot

Wigglyness by derivatives

Animation of derivatives

Making wigglyness matter

  • Integration of derivative (squared) gives wigglyness
  • Fit needs to be penalised
  • Penalty matrix gives the wigglyness
  • Estimate the \( \beta_k \) terms but penalise objective
    • “closeness to data” + penalty

Penalty matrix

  • For each \( b_k \) calculate the penalty
  • Penalty is a function of \( \beta \)
    • \( \lambda \beta^\text{T}S\beta \)
  • \( S \) calculated once
  • smoothing parameter (\( \lambda \)) dictates influence

How wiggly are things?

  • We can set basis complexity or “size” (\( k \))
    • Maximum wigglyness
  • Smooths have effective degrees of freedom (EDF)
  • EDF < \( k \)
  • Set \( k \) “large enough”

Don't throw away your residuals!

gam.check

plot of chunk unnamed-chunk-1

rqgam.check (Dunn and Smyth, 1996)

plot of chunk unnamed-chunk-2

Negative binomial

plot of chunk negbin

  • \( \text{Var}\left(\text{count}\right) = \) \( \mathbb{E}(\text{count}) + \kappa \mathbb{E}(\text{count})^2 \)
  • Estimate \( \kappa \)
  • Is quadratic relationship a “strong” assumption?
  • Similar to Poisson: \( \text{Var}\left(\text{count}\right) =\mathbb{E}(\text{count}) \)

Reproducible plots via stenography

  • Encode plot code within the image
  • via Rich FitzJohn's stegasaur package
# make a plot as a PNG file, but encode what generated it in the image
figuredout({plot(sample(100))}, "simpleplot.png")
# extract the information
cat(decode("simpleplot.png"))